Recurrent networks have been proposed as a model of associative memory. In such
models, memory items are stored in the strength of connections between neurons.
These modifiable connections or synapses constitute a shared resource among all stored 
memories, limiting the capacity of the network. Synaptic plasticity at different time scales 
can play an important role in optimizing the representation of associative memories, by 
keeping them sparse, uncorrelated and non-redundant. Here, we use a model of sequence 
memory to illustrate how plasticity allows a recurrent network to self-optimize by gradually 
re-encoding the representation of its memory items. A learning rule is used to sparsify 
large patterns, i.e., patterns with many active units. As a result, pattern sizes become 
more homogeneous, which increases the network's dynamical stability during sequence 
recall and allows more patterns to be stored. Last, we show that the learning rule allows 
for online learning in that it keeps the network in a robust dynamical steady state while 
storing new memories and overwriting old ones. 
INTRODUCTION

Memories are based on synaptically induced changes of intrinsi- 
cally generated brain activity. Examples for such intrinsic activ- 
ities are the recurring sequences of neuronal activity patterns 
in the hippocampus (Wilson and McNaughton, 1994; Nadasdy 
et al., 1999; Lee and Wilson, 2002; Davidson et al., 2009); see
Buhry et al. (2011); Wikenheiser and Redish (2012) for review. 
Classically, these sequences were interpreted as replaying previous 
activity patterns. Meanwhile they have been found to also pre- 
play future behavior (Diba and Buzsaki, 2007) or reverse replay 
past behavior (Foster and Wilson, 2006; Diba and Buzsaki, 2007). 
More recently, it has been shown that they even predict future 
behaviors (Gupta et al., 2010; Dragoi and Tonegawa, 2011, 2013; 
Pfeiffer and Foster, 2013). The diversity of these sequences has 
generated an equally diverse set of possible functional expla- 
nations, ranging from memory consolidation (Nakashiba et al., 
2009; Jadhav et al., 2012) to memory deletion (Hoffman et al., 
2007) and path planning (Azizi et al., 2013; Ponulak and Hopfield, 
2013). 

In this paper, we will specifically address one variant of the
memory consolidation and deletion hypothesis, viz. whether 
these sequences can be used to drive a learning rule that allows for 
efficiently re-encoding memories and thereby solve the problem 
of catastrophic forgetting. The basic idea of this hypothesis is that 
new memories might be encoded by assemblies that are not opti- 
mally sparse and thus allow secure retrieval. A retrograde learning 
rule that propagates long-term depression (LTD) will be shown 
to be able to reduce these assemblies toward a level of sparseness, 
which is optimal from the retrieval point of view and, at the same 
time, allows the network to operate in a stable regime of online 
learning, in which old memories are overwritten by new ones. 



This learning rule operates on a time scale that is slower than the 
fast time scale of initial imprinting. As a result, new memories 
will be represented by a larger number of neurons (and synapses)
than old memories, which are encoded more efficiently and will 
eventually be forgotten. 

MATERIALS AND METHODS 

Here, we investigate memory consolidation and retrieval in 
a network which stores sequential associations of binary pat- 
terns (Nadal, 1991; Gibson and Robinson, 1992; Hirase and 
Recce, 1996; Leibold and Kempter, 2006; Kammerer et al., 
2013). As in these previous papers, the dynamics is for- 
mulated in discrete time. The individual time steps can be 
biologically interpreted as the cycles of a collective network oscillation
(e.g., hippocampal ripple oscillations; Maier et al., 2011).
The employed network model is identical to that described 
in Medina and Leibold (2013) and lays particular emphasis on 
handling heterogeneous pattern sizes, i.e., the number of active 
neurons at any time may be different. Formally, this is expressed 
by the vector of coding ratios 

$$\boldsymbol{\phi} = \{f_0, f_1, f_2, \ldots, f_P\} \qquad (1)$$

where $M_k = f_k N$ is the number of active neurons in the $k$-th
binary pattern $\xi_k \in \{0, 1\}^N$, $N$ is the number of neurons in
the network, and the indices $k = 0, \ldots, P$ represent each of the
$P + 1$ patterns that are connected by the $P$ pairwise directed
associations. Unless mentioned otherwise, the coding ratios $f_k$
are randomly drawn from a gamma distribution (to avoid negative
pattern sizes) with mean coding ratio $\phi_0$ and standard
deviation $\sigma_\phi$.
The associations between the individual patterns of the
sequence $\xi_0, \xi_1, \xi_2, \ldots$ are stored in the synaptic weight matrix,
which is chosen according to a clipped Hebbian rule (Willshaw
et al., 1969): a synapse from neuron $j$ to $i$ has weight $s_{ij} = 0$ only
if a spike of neuron $i$ never follows one of neuron $j$ in any of
the $P$ associations, otherwise $s_{ij} = 1$. In addition to this Willshaw
rule, we also allow for a morphological connectivity, i.e., a synapse
from neuron $j$ to neuron $i$ only exists with probability $c_m$ (Gibson
and Robinson, 1992; Leibold and Kempter, 2006). This implies
a second set of binary synaptic variables $w_{ij}$, with $w_{ij} = 1$ if the
respective synapse exists and $w_{ij} = 0$ otherwise. For such a learning
rule and heterogeneous pattern sizes, it was shown in Medina
and Leibold (2013) that the probability $c$ of a potentiated synaptic
connection ($s_{ij} = 1$) equals

$$c = c_m \left(1 - \prod_{k=1}^{P} \left(1 - f_k f_{k-1}\right)\right). \qquad (2)$$



[Figure 1: panel A, state diagram with synaptic weights $s_{ij} = 0$ and $s_{ij} = 1$; panel B, distribution over the axis "Synaptic state".]

FIGURE 1 | Synaptic metaplasticity. (A) Synaptic meta levels with serial
state transitions. Learning initially potentiates the state (solid lines),
whereas LTD signals decrement it (dotted lines) with probability $q$. (B)
Distribution of synaptic states after storage of $P$ patterns. The higher the
network load, the more likely synapses are to be in higher meta levels.
Parameters: $N = 10^5$, $c_m = 0.1$, $\phi_0 = 0.02$, and $\sigma_\phi = 0.1\,\phi_0$.



In this and related models, the choice of binary synapses facilitates
the mathematical tractability of the theory, although, in
biology, synaptic weights generally follow long-tailed distributions
(Song et al., 2005). The long tail, however, allows one to
subdivide synapses into weak and strong ones, which could be 
considered as being approximated by a noisy binary approach. 

SYNAPTIC METAPLASTICITY 

According to Willshaw's learning rule, a synapse is in the potentiated
state ($s_{ij} = 1$) if it connects two neurons that fire in sequence
at least once. However, some neuron pairs may fire in sequence 
multiple times if they are part of the representation of consecutive 
patterns more than once. Although disregarded so far, the num- 
ber of times a neuron pair fires in sequence is important since 
it tells us how many associations rely on this connection being 
potentiated. In order to conserve this information while using 
binary synapses, we consider synaptic meta levels with serial state 
transitions, a model similar to that proposed in Amit and Fusi 
(1994); Leibold and Kempter (2008). 

A state diagram of our plasticity model is shown in Figure 1A.
After a synapse has been potentiated once, every further occurrence
of sequential firing in the sequence activation schedule
increments the meta level by one, leaving the synaptic weight $s_{ij}$
unchanged. Figure 1B shows the distribution of synaptic states in
the network for three different pattern loads P. At higher loads, 
synapses are more likely to reach higher meta levels. 
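The storage rule with meta levels can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `store_sequence` and `weight`, the dictionary representation, and the lazy sampling of the morphological connectivity are assumptions made for illustration only.

```python
import random

def store_sequence(patterns, c_m=1.0, seed=0):
    """Store the directed associations patterns[k] -> patterns[k+1].

    patterns: list of sets of active neuron indices.
    Returns a dict meta[(i, j)] giving the meta level of the synapse
    from neuron j (pre) to neuron i (post), i.e., the number of stored
    associations in which i fires one step after j.
    """
    rng = random.Random(seed)
    exists = {}  # morphological connectivity w_ij, sampled on first use (assumption)
    meta = {}
    for pre, post in zip(patterns[:-1], patterns[1:]):
        for j in pre:
            for i in post:
                if (i, j) not in exists:
                    exists[(i, j)] = rng.random() < c_m
                if exists[(i, j)]:
                    meta[(i, j)] = meta.get((i, j), 0) + 1
    return meta

def weight(meta, i, j):
    """Binary Willshaw weight s_ij: 1 if the meta level is at least one."""
    return 1 if meta.get((i, j), 0) > 0 else 0
```

A neuron pair that fires in sequence twice ends up at meta level 2, while its binary weight is still 1.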

NETWORK DYNAMICS 

Following Medina and Leibold (2013), neurons are modeled
using a simple threshold dynamics that translates the
synaptic matrix $J_{ij} = s_{ij} w_{ij}$ into an activity sequence: a neuron
$i$ fires a spike at cycle $t + 1$ if its postsynaptic potential
$h_i(t) = \sum_{j=1}^{N} (w_{ij} s_{ij} - b)\, x_j(t)$ at time $t$ exceeds the threshold
$\theta$. Here, $x_j(t) \in \{0, 1\}$ represents the binary state of neuron $j$ at
time $t$ and $b$ denotes the strength of a linear instantaneous feedback
inhibition (Hirase and Recce, 1996; Kammerer et al., 2013).
The negative feedback constant is chosen as $b = c$ for all subsequent
simulations (Medina and Leibold, 2013).
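One update cycle of this threshold dynamics might be sketched as follows. The function name `step` and the dense list-of-lists representation of the synaptic matrix are illustrative assumptions, not part of the original model code.

```python
def step(x, J, b, theta):
    """Advance the binary network state by one cycle.

    x:     list of 0/1 neuron states at time t
    J:     N x N list of lists, J[i][j] = w_ij * s_ij (synapse from j to i)
    b:     strength of the linear feedback inhibition
    theta: firing threshold
    """
    n = len(x)
    active = [j for j in range(n) if x[j]]
    nxt = []
    for i in range(n):
        # postsynaptic potential: excitation minus inhibition per active input
        h = sum(J[i][j] - b for j in active)
        nxt.append(1 if h > theta else 0)
    return nxt
```

Because inhibition is subtracted once per active presynaptic neuron, the total inhibition grows linearly with the overall network activity.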



To save computational time, most of the upcoming results are 
derived in a mean field approximation. To this end, in each time 
step, neurons are subdivided into two populations: an On popula- 
tion which is supposed to fire according to the sequence schedule 
and an Off population which is supposed to be silent (Leibold 
and Kempter, 2006). The number of active neurons at time step $t$
can thus be divided into a number $m_t$ of correctly activated
neurons (hits) and a number $n_t$ of incorrectly activated neurons
(false alarms). Using these conventions yields the mean field
dynamics (Medina and Leibold, 2013)



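$$(m_{t+1}, n_{t+1}) = \left(T_{\mathrm{On}}(m_t, n_t),\; T_{\mathrm{Off}}(m_t, n_t)\right) \qquad (3)$$

with

$$T_{\mathrm{On}}(m_t, n_t) = M_{t+1}\, \Phi\!\left(\frac{\mu_{\mathrm{On}} - b\,(m_t + n_t) - \theta}{\sigma_{\mathrm{On}}}\right) \qquad (4)$$

$$T_{\mathrm{Off}}(m_t, n_t) = (N - M_{t+1})\, \Phi\!\left(\frac{\mu_{\mathrm{Off}} - b\,(m_t + n_t) - \theta}{\sigma_{\mathrm{Off}}}\right) \qquad (5)$$

and $\Phi(z) = [1 + \mathrm{erf}(z/\sqrt{2})]/2$ denoting the cumulative distribution
function of the normal distribution. Here, the
mean number of synaptic inputs $\mu = \langle h(t) \rangle$ and the variance
$\sigma^2 = \langle h(t)^2 \rangle - \langle h(t) \rangle^2$, for the On population, are

$$\mu_{\mathrm{On}} = c_m m_t + c_m g\, n_t \qquad (6)$$

$$\sigma^2_{\mathrm{On}} = c_m m_t (1 - c_m) + c_m g\, n_t \left(1 - c_m g + \nu^2 c_m g\, (n_t - 1)\right) \qquad (7)$$

with $g = c/c_m$; see Equation (2). The analogous expressions for the
Off population are

$$\mu_{\mathrm{Off}} = c_m g\, (m_t + n_t) \qquad (8)$$

$$\sigma^2_{\mathrm{Off}} = c_m g\, (m_t + n_t) \left(1 - c_m g + \nu^2 c_m g\, (m_t + n_t - 1)\right). \qquad (9)$$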
Finally, the variability coefficient used in Equations (7) and (9)
is given by

$$\nu^2 = \frac{1}{g^2}\left(2g - 1 + \prod_{k=1}^{P}\left(1 - f_k\,(2 f_{k-1} - f_{k-1}^2)\right)\right) - 1. \qquad (10)$$
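As a rough illustration of how the mean field map of Equations (3) to (10) can be evaluated, consider the following Python sketch. It assumes that the variability coefficient enters the variances as $\nu^2 c_m g$ times the number of remaining active inputs and that $b = c$; the function names and the small floor on the variance (to avoid division by zero when no neuron is active) are choices made for this example only.

```python
import math

def phi_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def connectivity_ratio(f):
    """g = c / c_m from Equation (2) for coding ratios f[0..P]."""
    prod = 1.0
    for k in range(1, len(f)):
        prod *= 1.0 - f[k] * f[k - 1]
    return 1.0 - prod

def nu_squared(f):
    """Variability coefficient, Equation (10) as read in this sketch."""
    g = connectivity_ratio(f)
    prod = 1.0
    for k in range(1, len(f)):
        prod *= 1.0 - f[k] * (2.0 * f[k - 1] - f[k - 1] ** 2)
    return (2.0 * g - 1.0 + prod) / g ** 2 - 1.0

def mean_field_step(m, n, M_next, N, c_m, g, nu2, theta):
    """Map (m_t, n_t) to (m_{t+1}, n_{t+1}) via Equations (3)-(9)."""
    b = c_m * g  # feedback inhibition chosen as b = c
    mu_on = c_m * m + c_m * g * n
    var_on = (c_m * m * (1.0 - c_m)
              + c_m * g * n * (1.0 - c_m * g + nu2 * c_m * g * (n - 1.0)))
    mu_off = c_m * g * (m + n)
    var_off = c_m * g * (m + n) * (1.0 - c_m * g + nu2 * c_m * g * (m + n - 1.0))
    inhibition = b * (m + n)
    sd_on = math.sqrt(max(var_on, 1e-12))    # floor is an implementation assumption
    sd_off = math.sqrt(max(var_off, 1e-12))
    m_next = M_next * phi_cdf((mu_on - inhibition - theta) / sd_on)
    n_next = (N - M_next) * phi_cdf((mu_off - inhibition - theta) / sd_off)
    return m_next, n_next
```

Iterating `mean_field_step` along the sequence of pattern sizes $M_1, M_2, \ldots$ yields the hit and false alarm trajectories without simulating individual neurons.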



RETROSYNAPTIC LTD 

The replay model in Medina and Leibold (2013) assumed the 
synaptic matrix $J_{ij}$ to remain constant. Synaptic plasticity may,
however, take place on a slower time scale and change network 
dynamics between consecutive replay events. In this paper, we 
investigate the idea that replay evokes a retrosynaptic LTD to 
achieve a more efficient utilization of synaptic resources, thereby 
increasing storage capacity. We therefore assume that the stored 
patterns are initially too large and, over time, are reduced by 
learning such that the coding ratios $f_k$ converge to an optimal
value. 

This idea is implemented as shown in Figure 2. During replay
of association $\xi_t \to \xi_{t+1}$ (Figure 2A), active cells that receive
excessive synaptic input send a retrosynaptic LTD signal to all
presynaptic cells which were active in the previous time step. The
emission of such a signal is modeled as a stochastic process in
which the emission probability $\psi$ increases with the number $h$ of
synaptic inputs received by the cell like



otherwise. 



(11) 



To combine this learning rule with the mean field network
dynamics, we have to find an expression for the probability that
a presynaptic cell receives at least one retrosynaptic signal. The
number of inputs received at time $t + 1$ is on average $\mu_{\mathrm{On}}$ for an
On cell, and $\mu_{\mathrm{Off}}$ for an Off cell. Thus, for an On cell, we have a
probability (l — V^(/U.On))'^'"™'^' of receiving no retrograde LTD 
signal from any active cell in the On population, and a probability 
(l — iA(m off))'^'""'^' of receiving no retrograde LTD signal from 
the Off population. Similarly, for an Off cell, these probabilities 
are (l - lA(MOn))''""'""' and (l - fiiios))'"'"'^', and thus 



pOff ^ 1 



/ . . \ c,„m,+ i / \c„n,+i 

{l-if{llOr)) (l-V^(MOff)) (12) 

l-^(MOn)) (l-iA(Moff)) .(13) 



As illustrated in Figure 2B, upon receiving one or more LTD signals,
with probability $q$, the presynaptic cell decrements by one the
meta state of all its input synapses from the $m_{t-1} + n_{t-1}$ cells
that were active in the previous time step, as well as the meta state
of all its output synapses to the $m_{t+1} + n_{t+1}$ cells that are active
in the following time step. Each synapse in the subset of decremented
synapses therefore takes part in one association less. As
a result, the cell no longer takes part in the neural representation
of pattern $\xi_t$, although it might still be spuriously activated during
replay. On average, the size of pattern $\xi_t$ is therefore updated
according to



$$M_t \to \Psi M_t, \qquad (14)$$

where $\Psi = 1 - q\,P^{\mathrm{On}}$.

Here the parameter $h_0$ defines a minimal pattern size $M = h_0/c_m$
beyond which plasticity signals can occur. Its choice determines
the optimal memory capacity of the network, as this minimal
pattern size can become a stable fixed point of the dynamics of
pattern sizes.

ONLINE LEARNING 

If associations are stored in the network one after another (online
learning), new memories will overwrite old memories (Nadal
et al., 1986; Amit and Fusi, 1994), which is also known as
palimpsest learning, and thereby the connectivity between On
neurons of old associations is increasingly diluted. The remaining
signal strength of an association $k$ depends on the probability
$\gamma_k = 1 - p(0|k)$ that a synapse is not in state zero, given that it
participates in association $\xi_k \to \xi_{k+1}$ (i.e., it connects neurons
that fired in sequence during the storage of that association). To
account for overwriting, the dynamical Equations (6) and (7) are
modified as follows

$$\mu_{\mathrm{On}} = c_m \gamma_t m_t + c_m g\, n_t \qquad (15)$$

$$\sigma^2_{\mathrm{On}} = c_m \gamma_t m_t (1 - c_m \gamma_t) + c_m g\, n_t \left(1 - c_m g + \nu^2 c_m g\, (n_t - 1)\right). \qquad (16)$$

In our model framework, the synaptic connectivity is changed in 
two ways. First, during imprinting of a new association, synapses 
increment their meta state level. Second, synaptic states are decre- 
mented via retrosynaptic LTD. To capture these changes, we 
define the average state distribution p(s), which describes the 
probability that an arbitrarily chosen synapse is in state $s$, and thus
$c/c_m = 1 - p(0)$.



[Figure 2: panel A, retrosynaptic LTD signal; panel B, synaptic state decrement.]

FIGURE 2 | Retrosynaptic LTD. (A) During sequence replay, excessive
depolarization $h$ at time $t + 1$ triggers a retrosynaptic LTD signal that is
propagated with probability $\psi$ to all presynaptic cells that were active at
time $t$ (black squares denote hits, gray squares denote false alarms). (B)
Each cell receiving an LTD signal responds with probability $q$ by
decrementing the state of all its input synapses from cells that fired at time
$t - 1$ and all its output synapses to cells that fired at time $t + 1$.
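Effect of synaptic potentiation on state distribution

If a new association is added that links pattern $\xi_k$ to pattern $\xi_{k+1}$,
a random synapse increases its state with probability $f_k f_{k+1}$ and
thus the change in the state distribution is

$$p \to \left(\mathbb{1} + f_k f_{k+1}\, B\right) p \qquad (17)$$

where

$$B = \begin{pmatrix}
-1 & 0 & 0 & \cdots & & \\
1 & -1 & 0 & & & \\
0 & 1 & -1 & & & \\
0 & 0 & 1 & \ddots & & \\
\vdots & & & \ddots & -1 & 0 \\
 & & & & 1 & 0
\end{pmatrix} \qquad (18)$$

and $\mathbb{1}$ is the unit matrix.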
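The potentiation step described above can be sketched in a few lines of Python. The sketch assumes that the state distribution is truncated at a highest meta level that absorbs further increments and that the increment probability is $f_k f_{k+1}$; the function name `potentiate` is hypothetical.

```python
def potentiate(p, prob):
    """Shift a fraction `prob` of every state's probability mass up one level.

    p:    list with p[s] = probability that a synapse is in state s
    prob: probability that a random synapse takes part in the new
          association (assumed here to be f_k * f_{k+1})
    The highest state is treated as absorbing.
    """
    top = len(p) - 1
    new = list(p)
    for s in range(top):
        moved = prob * p[s]
        new[s] -= moved
        new[s + 1] += moved
    return new
```

Starting from an empty network, `potentiate([1.0, 0.0, 0.0], 0.25)` moves a quarter of the synapses from state 0 to state 1, and the distribution stays normalized.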



Effect of synaptic depression on state distribution. 

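Conversely, retrograde LTD is described by the matrix multiplication

$$p \to A\,p \qquad (19)$$

where

$$A = \begin{pmatrix}
1 & p_1 + p'_1 & p'_2 & 0 & \cdots & \\
0 & 1 - p_1 - p'_1 & p_2 & p'_3 & & \\
 & 0 & 1 - p_2 - p'_2 & p_3 & \ddots & \\
 & & 0 & 1 - p_3 - p'_3 & \ddots & p'_P \\
 & & & \ddots & \ddots & p_P \\
 & & & & 0 & 1 - p_P - p'_P
\end{pmatrix} \qquad (20)$$

and $p_s$, $p'_s$ are the probabilities that a replay event decreases the
state $s$ of a random synapse by 1 and 2, respectively.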
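The action of the depression matrix on a state distribution can be written without building the matrix explicitly. The following Python sketch is only an illustration; the function name `depress` and the list-based arguments are assumptions.

```python
def depress(p, p1, p2):
    """Apply one retrograde LTD step to the state distribution p.

    p1[s], p2[s]: probabilities that a synapse in state s goes down by
    one or by two levels (index 0 is unused). A double decrement from
    state 1 ends in state 0, as in the first row of the matrix.
    """
    top = len(p) - 1
    new = [0.0] * (top + 1)
    new[0] += p[0]  # state 0 cannot be depressed further
    for s in range(1, top + 1):
        new[s] += (1.0 - p1[s] - p2[s]) * p[s]
        new[s - 1] += p1[s] * p[s]
        new[max(s - 2, 0)] += p2[s] * p[s]
    return new
```

Each column of the implied matrix sums to one, so the total probability is conserved.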

To obtain $p_s$ and $p'_s$, we define the probability $p(s, \downarrow) = p(s)\, p_s$
that a synapse is in state $s$ and receives the signal to go down one
meta level. Similarly, $p(s, \Downarrow) = p(s)\, p'_s$ is the probability that a
synapse is in state $s$ and receives the signal to go down two meta
levels.

A depression event $\downarrow$ during replay of the association
$\xi_k \to \xi_{k+1}$ can have two origins: (1) the depression signal $\downarrow^{(k+)}$
that is sent by a neuron of pattern $\xi_k$ to its output synapses,
and (2) the depression signal $\downarrow^{(k-)}$ the neuron sends to its input
synapses. Since a subpopulation of synapses may be part of both
the inputs and the outputs of pattern $\xi_k$, a synapse may be
depressed twice and thus go down two levels. Since the patterns
are statistically independent, both depression events are independent
and thus the probability that a synapse is in state $s$ and goes
down by two levels upon replay of association $\xi_k \to \xi_{k+1}$ is given
by

$$p(s, \Downarrow) = p(\downarrow^{(k+)} | s)\; p(\downarrow^{(k-)} | s)\; p(s). \qquad (21)$$



Similarly, the probability $p(s, -)$ that a synapse stays in
state $s$ is

$$p(s, -) = \left(1 - p(\downarrow^{(k+)} | s)\right) \left(1 - p(\downarrow^{(k-)} | s)\right) p(s). \qquad (22)$$

A synapse either stays in state $s$, goes down by one state, or goes
down by two states, and thus

$$p(s) = p(s, -) + p(s, \downarrow) + p(s, \Downarrow). \qquad (23)$$

This normalization condition then yields the probability
$p(s, \downarrow)$ that a synapse is in state $s$ and is decreased by
one, viz.

$$p(s, \downarrow) = p(s, \downarrow^{(k+)}) + p(s, \downarrow^{(k-)}) - 2\, p(s, \Downarrow). \qquad (24)$$
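The bookkeeping of single and double decrements follows from the independence of the two depression signals and can be checked numerically. In this small Python sketch, `a` and `b` stand for the conditional probabilities of the output-side and input-side signals given the state; the function name is hypothetical.

```python
def depression_split(a, b):
    """Split two independent depression signals into outcomes.

    a: probability of the output-side signal given state s
    b: probability of the input-side signal given state s
    Returns (stay, one_level, two_levels), all conditional on s.
    """
    two = a * b                    # both signals arrive
    stay = (1.0 - a) * (1.0 - b)   # neither signal arrives
    one = a + b - 2.0 * two        # exactly one signal arrives
    return stay, one, two
```

For `a = 0.5` and `b = 0.25` the three outcomes are 0.375, 0.5 and 0.125, which sum to one as required by the normalization condition.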

The probabilities $p(s, \downarrow^{(k\pm)})$ can be further split up into two
non-overlapping subsets of synapses, one (called $k$) that connects
the On populations of association $k$ and another one (called $\bar{k}$)
denoting all other synapses. Therefore we have

$$p(s, \downarrow^{(k+)}) = p(s, \downarrow^{(k+)}, k) + p(s, \downarrow^{(k+)}, \bar{k}) \qquad (25)$$

$$p(s, \downarrow^{(k-)}) = p(s, \downarrow^{(k-)}, k-1) + p(s, \downarrow^{(k-)}, \overline{k-1}). \qquad (26)$$

Since the LTD signal $\downarrow$ is independent of the synapse state $s$, we
have

$$p(s, \downarrow^{(k+)}, k) = p(s|k)\; p(\downarrow^{(k+)}, k) \qquad (27)$$

$$p(s, \downarrow^{(k+)}, \bar{k}) = p(s|\bar{k})\; p(\downarrow^{(k+)}, \bar{k}) \qquad (28)$$

and in analogy for $k - 1$. The last terms on the right-hand side
are obtained from Equations (12) and (13) as follows


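$$p(\downarrow^{(k+)}, k) = q\,P^{\mathrm{On}}\, \frac{m_k\, m_{k+1}}{N^2} \qquad (29)$$

$$p(\downarrow^{(k+)}, \bar{k}) = q\,P^{\mathrm{On}}\, \frac{m_k\, n_{k+1}}{N^2} + q\,P^{\mathrm{Off}}\, \frac{n_k\,(m_{k+1} + n_{k+1})}{N^2} \qquad (30)$$

and

$$p(\downarrow^{(k-)}, k-1) = q\,P^{\mathrm{On}}\, \frac{m_k\, m_{k-1}}{N^2} \qquad (31)$$

$$p(\downarrow^{(k-)}, \overline{k-1}) = q\,P^{\mathrm{On}}\, \frac{m_k\, n_{k-1}}{N^2} + q\,P^{\mathrm{Off}}\, \frac{n_k\,(m_{k-1} + n_{k-1})}{N^2}. \qquad (32)$$
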


What remains to be obtained in Equations (27) and (28) are
the conditional probabilities $p(s|k)$ and $p(s|\bar{k})$. From heuristic
considerations, we approximate

$$p(s|k) = p(s-1|\bar{k})\; r_k + p(s|\bar{k})\,(1 - r_k). \qquad (33)$$
Equation (33) assumes that the presence of association
$\xi_k \to \xi_{k+1}$ can affect the conditional state distribution $p(s|k)$
in two ways: either it increases the state by one ($p(s-1|\bar{k})\, r_k$),
or it has no effect on the state ($p(s|\bar{k})\,(1 - r_k)$). The constants
$r_k$ can be interpreted as the fraction of synapses for
which association $k$ contributes to the next meta level.
We will refer to them as the remaining memory strength of
association $k$.

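Combining Equation (33) with

$$p(s) = p(s|k)\, p(k) + p(s|\bar{k})\, p(\bar{k}) \qquad (34)$$

we can recursively compute

$$p(s|\bar{k}) = \frac{p(s) - p(s-1|\bar{k})\; r_k f_k f_{k+1}}{1 - r_k f_k f_{k+1}} \qquad (35)$$

and in particular provide a connection between $r_k$ and $\gamma_k$ via

$$\frac{p(0)}{1 - r_k f_k f_{k+1}} = p(0|\bar{k}) = \frac{1 - \gamma_k}{1 - r_k}. \qquad (36)$$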
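The recursion for the conditional state distributions can be sketched as follows, assuming that a synapse belongs to the On-On subset of association $k$ with probability $f_k f_{k+1}$ (passed as `pk`). The function name `conditional_states` is hypothetical, and the sketch truncates the distribution at the highest listed state.

```python
def conditional_states(p, r_k, pk):
    """Compute p(s|k) and p(s|not k) from the average distribution p(s).

    p:   list with p[s] for s = 0, 1, ...
    r_k: remaining memory strength of association k
    pk:  probability that a synapse connects the On populations of
         association k (assumed to be f_k * f_{k+1})
    """
    denom = 1.0 - r_k * pk
    not_k = []
    prev = 0.0  # p(-1 | not k) is zero
    for ps in p:
        cur = (ps - prev * r_k * pk) / denom
        not_k.append(cur)
        prev = cur
    given_k = []
    for s in range(len(p)):
        below = not_k[s - 1] if s > 0 else 0.0
        given_k.append(below * r_k + not_k[s] * (1.0 - r_k))
    return given_k, not_k
```

Mixing the two conditional distributions with weights `pk` and `1 - pk` recovers the average distribution, which is a convenient consistency check.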



Effect of synaptic depression on signal connectivity 

In addition to changes in the state distribution $p$ that describes
the noise connectivity during associations, retrosynaptic LTD
also specifically influences the synapses between the On populations
according to $\gamma_l = 1 - p(0|l)$. For more recent associations
$\gamma_l$ will be large, whereas for older associations $\gamma_l$ will be small.
The change in $\gamma_l$ that results from retrosynaptic LTD while
replaying association $\xi_k \to \xi_{k+1}$ is computed from the change
in $p(0|l)$,

$$p(0|l) \to p(0|l) + \left(p(\downarrow | 1, l) + p(\Downarrow | 1, l)\right) p(1|l) + p(\Downarrow | 2, l)\; p(2|l). \qquad (37)$$

For associations $l \notin \{k-1, k\}$ the conditional probabilities of
depression are independent of the association $l$, i.e., $p(\downarrow | s, l) =
p(\downarrow | s) = p_s$ and $p(\Downarrow | s, l) = p'_s$. The conditional state occupancies
are obtained via the r-factors from Equation (36) as
$p(s|l) = p(s-1|\bar{l})\; r_l + p(s|\bar{l})\,(1 - r_l)$.

For associations $k-1$ and $k$, synapses can only experience by-chance
LTD from one of the two signals (association $k-1$ from
$\downarrow^{(k+)}$ and association $k$ from $\downarrow^{(k-)}$), since LTD from the other signal
would result in a decrease of the pattern size (with undiluted
connectivity). Likewise there is no double decrement for these
associations. As a result, the update rules for these associations
are

$$\begin{aligned}
p(0|k-1) &\to p(0|k-1) + p(\downarrow^{(k+)} | 1, k-1)\; p(1|k-1) \\
&= p(0|k-1) + p(\downarrow^{(k+)}, 1 | k-1) \\
&= p(0|k-1) + p(\downarrow^{(k+)}, 1, k | k-1) + p(\downarrow^{(k+)}, 1, \bar{k} | k-1) \\
&= p(0|k-1) + p(\downarrow^{(k+)}, k | k-1)\; p(1|k, k-1) + p(\downarrow^{(k+)}, \bar{k} | k-1)\; p(1|\bar{k}, k-1) \\
&= p(0|k-1) + p(\downarrow^{(k+)}, k)\; p(1|k, k-1) + p(\downarrow^{(k+)}, \bar{k})\; p(1|\bar{k}, k-1)
\end{aligned} \qquad (38)$$

and, replacing $k-1$ by $k$,

$$\begin{aligned}
p(0|k) &\to p(0|k) + p(\downarrow^{(k-)} | 1, k)\; p(1|k) \\
&= p(0|k) + p(\downarrow^{(k-)}, k-1)\; p(1|k, k-1) + p(\downarrow^{(k-)}, \overline{k-1})\; p(1|k, \overline{k-1}).
\end{aligned} \qquad (39)$$

The probabilities $p(1|k, k-1)$, $p(1|\bar{k}, k-1)$ and $p(1|k, \overline{k-1})$ in
Equations (38) and (39) can be obtained in analogy to

$$p(1|k, k-1) = \frac{p(1, k, k-1)}{p(k, k-1)} = \frac{p(k|1)\; p(k-1|1)\; p(1)}{p(k, k-1)} = \frac{p(1|k)\; p(1|k-1)}{p(1)} \qquad (40)$$

due to statistical independence of the patterns.



Effect of synaptic depression on subthreshold variance

The dynamics of sequence replay not only depends on the
mean connectivities $c$ and $c_m \gamma_k$ but also on the second
moment of the connectivity matrix as captured by $\nu^2$ from
Equation (10). Retrosynaptic LTD will also affect this second
moment. As an approximation, we again use the r-factors
from Equation (36), which are an estimate of the fraction of
presynaptic On neurons that contribute to the meta level of
association $k$. Thus, we can replace the coding ratio $f_{k-1}$ in
Equations (2) and (10) by the diluted coding ratio $r_{k-1} f_{k-1}$ and
obtain

$$\nu^2 = \frac{1}{g^2}\left(2g - 1 + \prod_{k=1}^{P}\left(1 - r_{k-1} f_{k-1} f_k\,(2 - r_{k-1} f_{k-1})\right)\right) - 1. \qquad (41)$$

Since the definition of the r-factors in Equation (33) implements
only an approximation, the two ways of computing the mean
connectivities via $c/c_m = 1 - p(0)$ and $c/c_m = 1 - \prod_{k=1}^{P} (1 - f_k f_{k-1} r_{k-1})$
are slightly different. To achieve numerical robustness
we obtain $p(0)$ by applying Newton's method to solve the
implicit equation

$$p(0) = \prod_{k=1}^{P} \left(1 - f_k f_{k-1}\, r_{k-1}\right) \qquad (42)$$

for $p(0)$, in which the r-factors have been expressed via
Equation (36).

RESULTS 

RETROSYNAPTIC LTD DURING SEQUENCE REPLAY SPARSIFIES LARGE 
PATTERNS

The mean field description for the pattern size changes from 
Equation (14) can be interpreted as a dynamical system itself, 
since it constitutes a discrete-time iterated map on the pattern 
sizes. The time scale of this dynamics is slower than the time
scale of sequence replay since, during the replay of a sequence,
the pattern sizes change only by a small amount. Figures 3A,B
show the temporal evolution of the sizes of some example patterns
and of the full distribution of pattern sizes for $q = 0.1$ that results
from the mean field Equation (14). The simulations show that the
pattern sizes converge to a common fixed point and, as a result, 
the pattern size distribution becomes delta-like. For such homo- 
geneous pattern sizes the memory capacity is maximized (Medina 
and Leibold, 2013). 
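The convergence of pattern sizes toward a common fixed point can be illustrated with a toy iteration in Python. This sketch rests on several assumptions that are not taken from the source: the emission probability `psi` is a hypothetical saturating function that vanishes up to $h_0$, replay is assumed perfect (no false alarms), the probability of receiving at least one LTD signal is assumed to be $1 - (1 - \psi(\mu_{\mathrm{On}}))^{c_m m_{t+1}}$, and the value of `a` used in the usage note below is a placeholder.

```python
import math

def psi(h, a, h0):
    """Hypothetical emission probability: zero up to h0, saturating above.

    This is an illustrative stand-in, not the function of Equation (11).
    """
    return 1.0 - math.exp(-a * (h - h0)) if h > h0 else 0.0

def shrink_factor(M, m_next, q, c_m, a, h0):
    """Assumed form of the factor Psi = 1 - q * P_on for perfect replay."""
    mu_on = c_m * M  # mean input with m_t = M and n_t = 0
    p_on = 1.0 - (1.0 - psi(mu_on, a, h0)) ** (c_m * m_next)
    return 1.0 - q * p_on

def iterate_size(M, steps, **params):
    """Iterate the map M -> Psi * M and return the list of visited sizes."""
    sizes = [M]
    for _ in range(steps):
        M = shrink_factor(M, **params) * M
        sizes.append(M)
    return sizes
```

With `iterate_size(2000.0, 50, m_next=2000.0, q=0.1, c_m=0.1, a=2.5e-4, h0=100.0)`, the pattern size shrinks monotonically and settles near $h_0/c_m = 1000$, because the assumed emission probability vanishes once the mean input drops to $h_0$.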

To more systematically analyze plasticity on the slow time
scale, we revisit dynamical Equation (14) and interpret it as a
one-dimensional iterated map $M_t \to \Psi M_t$. Figure 3C visualizes
FIGURE 3 | Retrosynaptic LTD signals sparsify and homogenize the
pattern size distribution. (A) Evolution of the sizes of a few sample
patterns for $q = 0.1$. (B) Evolution of the full pattern size distribution.
Initially (blue), sizes are very inhomogeneous ($\sigma_\phi/\phi_0 = 10\%$). After 5
iterations (green) of plasticity, patterns are encoded more sparsely and the
sequence becomes more homogeneous. After 10 iterations (red), the
sequence is essentially homogeneous. (C) The plasticity map given by
(14), showing how a pattern size $M_t$ is sparsified as a function of $q$. The
curves shown were obtained with parameters $a = 2.5 \cdot 10^{\ldots}$ and $h_0 = 100$
in (11) and setting $m_{t+1} = \phi_0 N$ and $n_{t+1} = 0$ in (12). For too high $q$
values ($q > 0.2$), the fixed point of the dynamics of pattern sizes at
$M = h_0/c_m = 1000$ is surpassed. (D) Connectivity decreases as a result of
plasticity sparsifying stored patterns ($P = 2500$). The rate of decrease
increases with $q$. (E) Minimal value of $c_m \Psi M$ as a function of $q$ for
different values of $c_m$ and constant $h_0$. At the critical value of $q$ the curves
bend down in a non-differentiable way. (F) Critical LTD probability $q_c$ as a
function of $c_m$ for constant $h_0$. Other parameters: $N = 10^5$, $\phi_0 = 0.02$, and
$c_m = 0.1$ unless mentioned otherwise.
FIGURE 4 | Plasticity of pattern sizes increases the dynamic stability
during sequence replay. In all graphs, we show the fraction $m_t/M_t$ of hits
(blue) at time step $t$ and the fraction $n_t/(N - M_t)$ of false alarms (red)
during the replay of a sequence (only the first 100 time steps are shown),
using the mean field model of Equations (3) and following. Left to right:
increasing plasticity iterations. Bottom to top: increasing firing thresholds $\theta$.
The initial pattern size distribution had parameters $\phi_0 = 0.02$ and
$\sigma_\phi/\phi_0 = 10\%$. Other parameters were $N = 10^5$, $c_m = 0.1$ and $P = 2500$.



the iteration function $\Psi M$ for different values of $q$. For small $q$,
the fixed-point equation $M = \Psi M$ has a solution for a maximal
pattern size $M = h_0/c_m$, which serves as an attractor of the discrete
dynamics for all starting values $M > h_0/c_m$. If $q$ is too high,
the iteration function bends down for $M > h_0/c_m$ and there is no
longer a single fixed point for all initial pattern sizes $M > h_0/c_m$.
The critical value of $q$ is thus determined by the condition

$$\min_{M > h_0/c_m} (\Psi M) \le h_0/c_m \qquad (43)$$

which means that the minimum of the iteration function $\Psi M$
for $M > h_0/c_m$ is smaller than or equal to $h_0/c_m$. The critical value $q_c$ is
the smallest value of $q$ for which condition (43) is fulfilled and is
indicated by the kink of the graphs in Figure 3E. For larger $q$ the
iterated map can produce pattern sizes below $h_0/c_m$, which are
then marginally stable fixed points, but the resulting pattern sizes
may be too small for successful replay. The critical $q_c$ is not universal
and depends on parameters. Most importantly, it decreases
with $c_m$ and $a$ (Figure 3F). The critical value remained above a few
percent for a wide range of parameters. Specifically, in sparsely
connected networks ($c_m \ll 1$), the choice $q \approx 0.05$ is generally
subcritical and thus allows for an optimal storage capacity.

The dynamics of pattern sizes is paralleled by a dynamics of 
the mean network connectivity from Equation (2); Figure 3D. A 
reduction of the pattern sizes leads to a corresponding decrease in 
connectivity. The rate of this decrease is higher for higher values 
of $q$. For subcritical values of $q$ ($0 < q < q_c$) the average connectivity
converges to a fixed point that is independent of $q$. For
supercritical $q$ ($1 > q > q_c$) the connectivity converges to a lower
fixed-point connectivity, indicating a substantial fraction of too 
small pattern sizes. In the extreme case q = 1 all synapses are 
depotentiated and the connectivity converges to 0. 

PLASTICITY DURING SEQUENCE REPLAY INCREASES DYNAMIC 
STABILITY 

The changes in connectivity due to retrosynaptic LTD are paral- 
leled by changes of fast dynamics of sequence replay according to 
Equations (3) as exemplarily illustrated for three different plastic- 
ity stages (initial, after 5 and 10 iterations) and firing thresholds in 
Figure 4. As plasticity proceeds and the pattern size distribution 
in the sequence becomes more homogeneous, the activity fluc- 
tuations during replay are reduced and, eventually, allow for the 
whole sequence to be retrieved successfully. 

In the example of Figure 4, learning extends the range of
thresholds under which the network successfully replays the
full sequence if the network was perfectly initialized ($m_0 = M_0$,
$n_0 = 0$). For a large threshold (e.g., $\theta = 55$), learning allows for
the emergence of ongoing sequence replay in a regime where initially
no self-sustained network activity was possible. Before any
plasticity takes place, the pattern sizes are highly inhomogeneous
and the network falls silent almost immediately. After 5 plasticity
iterations, fluctuations are reduced and the network is able
to successfully retrieve more items in the sequence. Near perfect
pattern retrieval ($m_t/M_t = 1$ and $n_t/(N - M_t) = 0$) is made
possible after 10 iterations. Similarly, for low thresholds (e.g.,
$\theta = 25$), replay initially drives the network into an epileptic state
($m_t/M_t \approx n_t/(N - M_t) \approx 0.5$). The reduction of pattern sizes
due to learning, again, allows for ongoing sequence replay.
Defining the retrieval quality (Leibold and Kempter, 2006)

$$\Gamma_t = m_t/M_t - n_t/(N - M_t) \qquad (44)$$

as the difference between hit ratio and false alarm ratio
allows a better comparison of the replay performance for a large
set of parameter choices. Formally this is done via the replay success
rate, which is the fraction of runs for which at time $t$ the replay
quality $\Gamma_t$ is above 0.5 (Medina and Leibold, 2013).
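Both quantities are straightforward to compute from simulated hit and false alarm counts. The following Python sketch uses hypothetical function names and assumes that each run is stored as a list of quality values indexed by time step.

```python
def retrieval_quality(m, n, M, N):
    """Hit ratio minus false alarm ratio, Equation (44)."""
    return m / M - n / (N - M)

def success_rate(runs, t, threshold=0.5):
    """Fraction of runs whose retrieval quality at time t exceeds the threshold.

    runs: list of per-run sequences of retrieval quality values
    """
    return sum(1 for quality in runs if quality[t] > threshold) / len(runs)
```

Perfect replay gives a quality of 1, whereas an epileptic state in which half of all neurons fire gives a quality of 0.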

Figure 5A shows the evolution of replay success rates for three 
plasticity stages and three different memory loads P. Initially, the 
pattern sizes are large and inhomogeneous, and ongoing sequence 
replay is not possible. Only for small loads ($P = 2500$) and for
a small firing threshold range ($\theta \approx 45$), can the first items be
retrieved with high probability. As plasticity reduces inhomogeneity
and sparsifies the patterns, the range of firing thresholds
$\theta$ for which the full sequence can be retrieved expands. This is
made possible by a decrease in the noise connectivity c, shown in 
Figure 5B and verified through cellular simulations. In a modified 
model without synaptic meta states there was no improvement by 
applying repeated learning steps, since synapses were switched to 
an inactive state too quickly (Figure 5C). 

ONLINE LEARNING 

So far, the initial distribution of pattern sizes was centered at 
mean values far above the fixed point $M = h_0/c_m$. However, once
the pattern size distribution has reached this optimal value, ret- 
rosynaptic LTD will only take place if a new association with an 
oversized pattern is added into the synaptic matrix. In our model, 
this can be simulated as a homogeneous sequence with one pattern
of size larger than $M = h_0/c_m$, as illustrated in Figure 6: for
FIGURE 5 | (A) Replay success rate over time for different firing
thresholds $\theta$ and a sequence of length $Q = 100$. These plots were
obtained using the mean field model, and were verified using cellular
simulations. Left to right: increasing plasticity iterations. Top to bottom:
increasing pattern load $P$. The initial pattern size distribution had
parameters $\phi_0 = 0.02$ and $\sigma_\phi/\phi_0 = 10\%$. Other parameters were:
$N = 10^5$, $c_m = 0.1$. (B) Connectivity decreases as the network sparsifies
its stored associations. This plot was obtained by simulating the actual
neural network with three different pattern loads $P$ and a randomly
generated coding ratio vector $\boldsymbol{\phi}$. The connectivity was calculated both
using the mean field Equation (2) (blue) and counting the actual
number of potentiated synapses (red dots), showing a perfect match.
(C) Advantage of metaplasticity (left) over simple binary synapses (right)
during retrosynaptic LTD. Blue and red traces indicate hits and false
alarms (as in Figure 4) for 0, 5 and 10 learning steps. The bottom row
depicts the replay quality of the 100th pattern in the sequence as a
function of the number of learning steps. Only with metaplasticity does the
replay remain stable for many learning steps.



low firing thresholds ($\theta = 26$), the excess synaptic drive generated
by an oversized pattern initially leads to sequence termination
by setting the network into an epileptic state. Plasticity via
retrosynaptic signals gradually reduces the size of the problem- 
atic pattern, eventually allowing for successful replay of the full 
sequence. This shows that retrosynaptic LTD in principle makes 
it possible to integrate new associations into the network, and 



therefore provides a possible basis for online learning, i.e., the 
ongoing storage of new memories. 

Of course, adding new associations (increasing P) will consequently
also increase the mean connectivity c, up to a point at
which, classically, memories can no longer be retrieved (Willshaw
et al., 1969; Nadal, 1991; Kammerer et al., 2013). For these large
connectivities c, the false alarms add considerable synaptic inputs






[Figure 6 graphic omitted. Axis labels: Time step; Plasticity iteration n.
Traces: m/M and n/(N-M). Panel labels: n = 0, n = 5, n = 10.]



FIGURE 6 | (A) Homogeneous sequence with a double-sized pattern at
t = 50. (B) Evolution of the oversized pattern with plasticity (η = 0.1). (C)
Initially, sequence replay fails at t = 50 because of the excessive synaptic
drive generated by the oversized pattern, which leads the network to an
epileptic state (top). After 5 iterations, the network explosion is slower but
still present (middle). Successful replay of the full sequence is possible
after 10 iterations (bottom). Parameters: N = 10^5, c_m = 0.1, c = 5%,
φ_0 = 0.01, θ = 26.



such that a neuron is no longer always able to correctly decide
whether it should fire or not. Using our present model of retrograde
LTD, however, neurons could detect such over-excitation
and may subsequently depress synapses.
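
The growth of connectivity with pattern load can be illustrated with the
classical Willshaw estimate for homogeneous patterns. The sketch below is
not the mean field equation (2) of this paper. It assumes a single coding
ratio f for all patterns, independently drawn random patterns, and binary
synapses that are potentiated whenever the pre- and postsynaptic neuron are
active in consecutive patterns. The function names and the numerical values
are hypothetical and chosen for illustration only.

```python
import math


def potentiated_fraction(P, f):
    """Expected fraction of existing synapses that are potentiated after
    storing P associations with coding ratio f (classical Willshaw
    estimate for homogeneous, independent patterns; an assumption here)."""
    # A given synapse is left untouched by one association with
    # probability 1 - f*f, and by all P associations with (1 - f*f)**P.
    return 1.0 - (1.0 - f * f) ** P


def load_for_fraction(c, f):
    """Pattern load P at which the potentiated fraction reaches c
    (inverse of potentiated_fraction)."""
    return math.log(1.0 - c) / math.log(1.0 - f * f)


if __name__ == "__main__":
    f = 0.01  # illustrative coding ratio, not taken from the paper
    for P in (1000, 5000, 20000):
        print(P, round(potentiated_fraction(P, f), 3))
```

The estimate saturates toward 1 as P grows, which is the regime in which
false alarms dominate the synaptic input and retrieval breaks down.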

To investigate whether this mechanism allows for self-organized
sequence replay in a steady state, we set up a simulation
in which we add sequences of 7 new patterns before each plasticity
episode and monitor the retrieval quality as well as the mean
connectivity. The dynamics of the connectivity c and c„ y^. is thereby
simulated according to Section 2.4.

The result of one such simulation is summarized in Figure 7.
The simulation starts with an empty network, i.e., all synapses
are in state 0. Each time after a new sequence is stored, the newest
60 sequences (if already available) are replayed, starting with perfect
initialization of the first pattern, m = M and n = 0. These
replays induce retrosynaptic LTD. Before the network has reached
a steady state, replay is generally successful for all sequences
(Figure 7A) and is worse for the last recalled patterns in younger
sequences, because there the pattern sizes have not yet converged
to their optimum h_0/c_m = 1000: oversized patterns
tend to evoke dynamical instabilities that lead to many false
alarms and bad replay quality. After the network has reached a
steady state, the first of the 60 replays generally fail, whereas the
younger sequences can be replayed at high quality (Figure 7B).
Interestingly, the mean connectivity c converges to its steady state
more quickly than the replay dynamics (Figure 7C). The pattern
sizes are slightly above their optimum h_0/c_m (Figure 7D; note
that each 7th pattern is the final pattern of its sequence and does
not shrink according to the learning rule. These final patterns stay
at their initial size). Only for the newest patterns do the sizes reflect
the initial distribution (here a uniform distribution between 1000
and 2000).
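
The optimum pattern size quoted above follows from the simulation
parameters by simple arithmetic. The short derivation below is a sketch
under one assumption that is not spelled out in this section: h_0 is read
as a target number of synaptic inputs per neuron, and c_m as the probability
that a synapse exists between two neurons, so that a pattern of M active
neurons provides on average c_m M inputs.

```latex
% Assumption: h_0 is a target number of inputs per neuron and
% c_m is the probability that a synapse exists between two neurons.
\begin{align}
  c_m M^{*} = h_0
  \quad\Longrightarrow\quad
  M^{*} = \frac{h_0}{c_m} = \frac{100}{0.1} = 1000 .
\end{align}
```

With the initial sizes drawn uniformly between 1000 and 2000, new patterns
therefore start at one to two times this value.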

The r values, which measure the remaining memory strength of
an association (see Methods), provide an additional view on the
memory capacity of the network (Figure 7E). Their convergence to
zero for old memories reflects the memory time scale of the network.
Additionally, the approach to the steady state is made visible
if one monitors the r value of the oldest memory (r_1) over time
(Figure 7F). The convergence of r is much slower than the convergence
of the mean connectivity c (Figure 7C), explaining why the
replay dynamics continue to change long after the mean connectivity
has reached its steady state.

DISCUSSION 

Fast hippocampal activity sequences have been hypothesized to
underlie memory consolidation (Ego-Stengel and Wilson, 2010;
Mölle and Born, 2011; Jadhav et al., 2012). On the cellular level,
the associated re-encoding of episodic memories can either occur
at the synapses between hippocampus and neocortex (Buzsaki,
1996; Frankland and Bontempi, 2005) or within the hippocampus
itself. So far, hypotheses for hippocampus-intrinsic consolidation
have mainly focused on synaptic mechanisms (Frey and Morris,
1997; Milekic and Alberini, 2002; Päpper et al., 2011). The present
paper provides a mechanistic model of memory re-encoding on
the circuit level whereby associations between assemblies of neurons
are not strengthened over time, but assemblies are reduced
in size to utilize the hippocampal resources more efficiently.

Retroaxonal learning affecting both input and output synapses
of a neuron has previously been suggested to aid the stabilization
of recent memories (Harris, 2008), although only in the context of
synaptic potentiation. There, neurotrophins have been hypothesized
to constitute a plausible underlying biochemical pathway.
Here, we suggest a specific functional role for a retroaxonal spread
of depression and have shown that it may allow a network to
operate in an online mode where old memories are overwritten
by new memories. Moreover, the suggested retrograde LTD predicts
that depression in output synapses should be correlated with
depression in input synapses.
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FIGURE 7 I Online learning. (A) Replay of the 60 most recent 
sequences before the network has reached a steady state. Blue and red
lines depict hits and false alarms, respectively; the black line indicates
the retrieval quality r. (B) Same as (A) after the network has reached its
steady state. (C) Mean connectivity c as a function of time. (D) Pattern
sizes. Blue dots indicate the sizes of the last (7th) pattern of each
sequence. (E) The remaining memory strength r of all associations at
the end of the simulation (500 sequences; 3500 patterns). (F) The
memory strength of the oldest association as a function of simulation
time. Parameters of the simulation were N = 4·10'', c_m = 0.1, η = 0.03,
θ = 30, h_0 = 100, a = 10"^. The simulation was terminated after having
stored 500 sequences of length 7.



A different mechanism suggested to reduce the overall excitatory
drive in a network is synaptic scaling, whereby all synapses
of an overexcited neuron undergo LTD (Turrigiano et al., 1998;
Watt et al., 2000; Turrigiano, 2008; Savin et al., 2009). Retroaxonal
learning is a more content-specific mechanism than synaptic scaling
since it only affects synapses that have been active in the recent
past and thus generally accounts for longer retention times.

Previous models of online learning (Amit and Fusi, 1994;
Fusi et al., 2005; Ben Dayan Rubin and Fusi, 2007; Leibold and
Kempter, 2008; Amit and Huang, 2010; Huang and Amit, 2011)
usually do not explicitly take into account the network dynamics
underlying the induction of plasticity. This paper presents a
hypothesis of how LTD can be derived from network dynamics.
The initial imprinting of the memories by LTP is still ad hoc since
we assume it to occur via extra-hippocampal signals.

Several other theoretical explanations for sequence replay and
the sharp-wave ripple state have been suggested. (1) Sequences
can be seen as avalanche-like activity patterns that are amplified
by dendritic non-linearities (Memmesheimer, 2010; Jahnke
et al., 2012, 2013). (2) CA1 pyramidal cell spike patterns may be
triggered by strong feedforward excitation from CA3 inputs that
are temporally coordinated by fast recurrent inhibition (Ylinen
et al., 1995; Geisler et al., 2005; Taxidis et al., 2012). (3) The ripple
oscillation may result from a network of gap-junction coupled
axons (Traub et al., 1999; Traub and Bibbig, 2000; Vladimirov
et al., 2013). (4) Sequences may result from a few overlapping
attractor states in a recurrent network of neurons (Azizi et al.,
2013). So far, these models have hardly been evaluated with respect
to their memory capacity (although coding capacity was probed in
Azizi et al., 2013).

High memory capacities have been found in classical models
of memory networks, developed independently of hippocampal
physiology, that suppose neuronal sequences to result from
attractor networks with asymmetrically biased synaptic matrices
(Dehaene et al., 1987; Buhmann and Schulten, 1988) in discrete
time. One major drawback of these classical theories, as well
as of the model presented here, is their formulation in discrete time,
which makes them hard to connect to cell-physiological properties
of pyramidal cells. On the cellular level, sequence replay is
most likely associated with the presence of huge precisely timed
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excitatory and inhibitory synaptic conductances (Maier et al., 
2011). Whether and how under such conditions a neuron can fire 
and, more specifically, can select to fire at one specific oscillation 
cycle during a ripple, remains to be shown. 

Sparsification of the hippocampal code may be an important
intermediate step to prepare consolidation of memories in
the hippocampal-neocortical loop, since generally storage capacity
increases with sparseness (Nadal, 1991; Leibold and Kempter,
2006) and associating large hippocampal assemblies with neocortical
states might be too costly. On the other hand, initially large
assemblies might have the advantage that new associations can be
retrieved more robustly. Optimal sparseness cannot be obtained
by translating from one brain area to another via a random
connectivity matrix, since then associations get lost as they may fall
in the lower tail of the statistical distribution of the number of
synaptic connections and thus do not give rise to sufficient excitation
in the downstream brain area. Optimally sparse codes, hence,
always require additional plasticity rules that carve out the subset
of neurons that can fire reliably. The activity-driven increase
in sparseness could also explain the prevalence of a few dominant
preplay sequences (Dragoi and Tonegawa, 2013) that may provide
an easily addressable substrate for future associations. Our model
predicts that, once these sequences are connected with a memory
item, the internal representation becomes more sparse and the
sequences are no longer spontaneously visible. However, they are
nevertheless stored within the hippocampal synaptic matrix and
can be retrieved upon presentation of appropriate cue patterns.
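
The lower-tail argument about random connectivity can be made concrete
with a small calculation. The sketch below is an illustration only and is
not part of the model in this paper. It assumes that a downstream neuron
receives each of the M active inputs independently with probability c_m, so
that its number of inputs is binomially distributed, and that it fails to
fire when this number is below a threshold theta. The function names and all
numerical values are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability P[X = k] for X ~ Binomial(n, p), 0 < p < 1,
    evaluated in log space to avoid overflow for large n."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
               - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def prob_insufficient_input(M, c_m, theta):
    """Probability that a downstream neuron receives fewer than theta
    inputs from an assembly of M active neurons, assuming independent
    random connections with probability c_m."""
    return sum(binom_pmf(k, M, c_m) for k in range(min(theta, M + 1)))


if __name__ == "__main__":
    c_m, theta = 0.1, 30  # illustrative values, assumptions
    for M in (1000, 500, 300):
        print(M, prob_insufficient_input(M, c_m, theta))
```

Under these assumptions, shrinking the assembly at a fixed threshold raises
the probability of insufficient input, which is why a sparse code obtained
by random projection alone would lose associations.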
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